SERAPHIM: A Wavetable Synthesis System with 3D Lip Animation for Real-Time Speech and Singing Applications on Mobile Platforms
Abstract
Singing synthesis is a rising musical art form gaining popularity among composers and listeners alike. To date, this art form has been largely confined to the offline setting of the music studio, whereas a large part of music is about live performance. This calls for a real-time synthesis system readily deployable for on-stage applications. SERAPHIM is a wavetable synthesis system that is lightweight and deployable on mobile platforms. Apart from conventional offline studio applications, SERAPHIM also supports real-time synthesis, accepting live control inputs for on-stage performances. It also provides easy lip animation control. SERAPHIM will be made available as a toolbox on Unity 3D for easy adoption into game development across multiple platforms. A precompiled version will also be deployed as a VST studio plugin, directly addressing end users. It currently supports Japanese (singing only) and Mandarin (speech and singing). This paper describes our work on SERAPHIM and discusses its capabilities and applications.
Similar resources
SERAPHIM Live! - Singing Synthesis for the Performer, the Composer, and the 3D Game Developer
The human singing voice is a highly expressive instrument capable of producing a variety of complex timbres. Singing synthesis today is popular amongst composers and studio musicians accessing the technology by means of offline sequencing platforms. Only a couple of singing synthesizers are known to be equipped with both the real-time capability and the user interface to successfully target live ...
Text2Video: Text-Driven Facial Animation using MPEG-4
We present a complete system for the automatic creation of talking head video sequences from text messages. Our system converts the text into MPEG-4 Facial Animation Parameters and synthetic voice. A user selected 3D character will perform lip movements synchronized to the speech data. The 3D models created from a single image vary from realistic people to cartoon characters. A voice selection ...
Web-enabled 3D talking avatars based on WebGL and HTML5
We describe a system for plugin-free deployment of 3D talking characters on the web. The system employs the WebGL capabilities of modern web browsers in order to produce real-time animation of speech movements, in synchrony with text-to-speech synthesis, played back using HTML5 audio functionality. The implementation is divided into a client and a server part, where the server delivers the audio ...
A Mobile and Fog-based Computing Method to Execute Smart Device Applications in a Secure Environment
With the rapid growth of smart device and Internet of Things applications, the volume of communication and data in networks has increased. Due to network lag and massive demand, centralized, traditional cloud computing architectures cannot keep up with high user demand and are not suitable for executing delay-sensitive and real-time applications. To resolve these challenges, we p...
Carnival-combining speech technology and computer animation.
Speech is powerful information technology and the basis of human interaction. By emitting streams of buzzing, popping, and hissing noises from our mouths, we transmit thoughts, intentions, and knowledge of the world from one mind to another. We’re accustomed to thinking of speech as an acoustic, auditory phenomenon. However, speech is also visible. Although the primary function of speech is to ...